Bellman: A Data Quality Browser
Abstract
When a data analyst starts a new project, she is often presented with one or more very large databases (containing hundreds or thousands of tables). Extracting useful information from the databases can be a difficult problem: documentation is usually minimal, the data is poorly structured and difficult to join, and the quality of the data is often poor. As an aid in exploratory analysis, we are developing a data quality browser that allows the analyst to quickly gain an understanding of the contents of the tables and their relationships. In addition, the browser serves as a platform for issuing data mining queries targeted towards a further understanding of data quality problems. We illustrate the utility of the data quality browser with several examples.
Similar resources
The Bellman Data Quality Browser
Keynote Talk Abstract Data quality is a serious concern in complex industrial-scale databases, which often have thousands of tables and tens of thousands of columns. Commonly encountered problems include missing data (null values), duplicates and default values in columns supposed to be treated as keys, data inconsistencies (violation of functional dependencies), and poor quality join paths (lack ...
APPLICATION OF THE BELLMAN AND ZADEH'S PRINCIPLE FOR IDENTIFYING THE FUZZY DECISION IN A NETWORK WITH INTERMEDIATE STORAGE
In most real-life applications we deal with the problem of transporting certain special fruits, such as bananas, which have particular production and distribution processes. In this paper we restrict our attention to formulating and solving a new bi-criterion problem on a network in which, in addition to minimizing the traversing costs, admissibility of the quality level of fruits is a main objecti...
Sushi.R: flexible, quantitative and integrative genomic visualizations for publication-quality multi-panel figures
MOTIVATION Interpretation and communication of genomic data require flexible and quantitative tools to analyze and visualize diverse data types, and yet, a comprehensive tool to display all common genomic data types in publication quality figures does not exist to date. To address this shortcoming, we present Sushi.R, an R/Bioconductor package that allows flexible integration of genomic visuali...
Internet QoS Routing Using the Bellman-Ford Algorithm
Multimedia applications are Quality of Service (QoS) sensitive, which makes QoS support indispensable in high speed Integrated Services Packet Networks (ISPN). An important aspect is QoS routing, namely, the provision of QoS routes at session set up time based on user request and information about available network resources. This paper develops optimal QoS routing algorithms within an Autonomo...
Design of a web service for managing post-flood relief operations using volunteered geographic information (VGI) based on open-source technology
Access to precise spatial and real-time data plays a valuable role in the speed and quality of flood relief operations and, subsequently, reduces human and financial losses. Collecting and processing real-time flood data, for instance the precise location and situation of flood victims, may be a big challenge in Iran regarding the hardware facilities (such as high resolution aerial ...